Recently, India has experienced a surge in COVID-19 cases and deaths. With medical resources stretched thin and living conditions often poor, hospitals have had to turn many people away in favor of patients with more severe cases. Until recently, though, it seemed as if India was handling the pandemic well! So what has changed? In this notebook we will explore that question and investigate which locations within India are hit hardest and need medical resources the most.
https://github.com/datameet/covid19
This data combines multiple data sources from Indian government websites into a cleaner and more accessible format. The government sources for COVID-19 data are the Ministry of Health & Family Welfare and the Indian Council of Medical Research, or ICMR. The data is stored in multiple files representing different pieces of data, such as the total number of cases as a time series, and the number of cases and deaths per state. Each piece of data is stored as a .json file and will need to be preprocessed so that we can work with it more easily.
We will start by importing some of the libraries we will need. The requests library is used to download the .json file from the internet and parse its contents. The json library is then used to save that data to a local JSON file and load it back as a dictionary. Finally, we use pandas to convert the dictionary into a dataframe so that we can more easily plot and visualize the data.
import requests
import json
import pandas as pd
Because the data is stored in a specific JSON format, we need to download the JSON file from the web and load it into a Python data structure. We will start by acquiring the all_totals.json file, which contains the totals for the number of active cases, number of deaths, number of people cured, and the total number of confirmed cases, all with the associated timestamps.
# Takes a JSON file from the data source and places its contents inside a dictionary
def get_json(json_file):
    link = 'https://raw.githubusercontent.com/datameet/covid19/master/data/' + json_file
    r = requests.get(link)
    data = r.json()
    # takes the data from link and saves it in a JSON file
    with open(json_file, 'w') as f:
        json.dump(data, f)
    # takes the JSON file and places the contents in a dictionary
    with open(json_file) as f:
        data = json.load(f)
    return data
data = get_json('all_totals.json')
data
 ... 'death'], 'value': 20160},
{'key': ['2020-07-07T08:00:00.00+05:30', 'total_confirmed_cases'],
'value': 719665},
{'key': ['2020-07-08T08:00:00.00+05:30', 'active_cases'], 'value': 264944},
{'key': ['2020-07-08T08:00:00.00+05:30', 'cured'], 'value': 456831},
{'key': ['2020-07-08T08:00:00.00+05:30', 'death'], 'value': 20642},
{'key': ['2020-07-08T08:00:00.00+05:30', 'total_confirmed_cases'],
'value': 742417},
{'key': ['2020-07-09T08:00:00.00+05:30', 'active_cases'], 'value': 269789},
{'key': ['2020-07-09T08:00:00.00+05:30', 'cured'], 'value': 476378},
{'key': ['2020-07-09T08:00:00.00+05:30', 'death'], 'value': 21129},
{'key': ['2020-07-09T08:00:00.00+05:30', 'total_confirmed_cases'],
'value': 767296},
{'key': ['2020-07-10T08:00:00.00+05:30', 'active_cases'], 'value': 276685},
{'key': ['2020-07-10T08:00:00.00+05:30', 'cured'], 'value': 495513},
{'key': ['2020-07-10T08:00:00.00+05:30', 'death'], 'value': 21604},
{'key': ['2020-07-10T08:00:00.00+05:30', 'total_confirmed_cases'],
'value': 793802},
{'key': ['2020-07-11T08:00:00.00+05:30', 'active_cases'], 'value': 283407},
{'key': ['2020-07-11T08:00:00.00+05:30', 'cured'], 'value': 515386},
{'key': ['2020-07-11T08:00:00.00+05:30', 'death'], 'value': 22123},
{'key': ['2020-07-11T08:00:00.00+05:30', 'total_confirmed_cases'],
'value': 820916},
{'key': ['2020-07-12T08:00:00.00+05:30', 'active_cases'], 'value': 292258},
{'key': ['2020-07-12T08:00:00.00+05:30', 'cured'], 'value': 534621},
{'key': ['2020-07-12T08:00:00.00+05:30', 'death'], 'value': 22674},
{'key': ['2020-07-12T08:00:00.00+05:30', 'total_confirmed_cases'],
'value': 849553},
{'key': ['2020-07-13T08:00:00.00+05:30', 'active_cases'], 'value': 301609},
{'key': ['2020-07-13T08:00:00.00+05:30', 'cured'], 'value': 553471},
{'key': ['2020-07-13T08:00:00.00+05:30', 'death'], 'value': 23174},
{'key': ['2020-07-13T08:00:00.00+05:30', 'total_confirmed_cases'],
'value': 878254},
{'key': ['2020-07-14T08:00:00.00+05:30', 'active_cases'], 'value': 311565},
{'key': ['2020-07-14T08:00:00.00+05:30', 'cured'], 'value': 571460},
{'key': ['2020-07-14T08:00:00.00+05:30', 'death'], 'value': 23727},
{'key': ['2020-07-14T08:00:00.00+05:30', 'total_confirmed_cases'],
'value': 906752},
{'key': ['2020-07-15T08:00:00.00+05:30', 'active_cases'], 'value': 319840},
{'key': ['2020-07-15T08:00:00.00+05:30', 'cured'], 'value': 592032},
{'key': ['2020-07-15T08:00:00.00+05:30', 'death'], 'value': 24309},
{'key': ['2020-07-15T08:00:00.00+05:30', 'total_confirmed_cases'],
'value': 936181},
{'key': ['2020-07-16T08:00:00.00+05:30', 'active_cases'], 'value': 331146},
{'key': ['2020-07-16T08:00:00.00+05:30', 'cured'], 'value': 612815},
{'key': ['2020-07-16T08:00:00.00+05:30', 'death'], 'value': 24915},
{'key': ['2020-07-16T08:00:00.00+05:30', 'total_confirmed_cases'],
'value': 968876},
{'key': ['2020-07-17T08:00:00.00+05:30', 'active_cases'], 'value': 342473},
{'key': ['2020-07-17T08:00:00.00+05:30', 'cured'], 'value': 635757},
{'key': ['2020-07-17T08:00:00.00+05:30', 'death'], 'value': 25602},
{'key': ['2020-07-17T08:00:00.00+05:30', 'total_confirmed_cases'],
'value': 1003832},
{'key': ['2020-07-18T08:00:00.00+05:30', 'active_cases'], 'value': 358692},
{'key': ['2020-07-18T08:00:00.00+05:30', 'cured'], 'value': 653751},
{'key': ['2020-07-18T08:00:00.00+05:30', 'death'], 'value': 26273},
{'key': ['2020-07-18T08:00:00.00+05:30', 'total_confirmed_cases'],
'value': 1038716},
{'key': ['2020-07-19T08:00:00.00+05:30', 'active_cases'], 'value': 373379},
{'key': ['2020-07-19T08:00:00.00+05:30', 'cured'], 'value': 677423},
{'key': ['2020-07-19T08:00:00.00+05:30', 'death'], 'value': 26816},
{'key': ['2020-07-19T08:00:00.00+05:30', 'total_confirmed_cases'],
'value': 1077618},
{'key': ['2020-07-20T08:00:00.00+05:30', 'active_cases'], 'value': 390459},
{'key': ['2020-07-20T08:00:00.00+05:30', 'cured'], 'value': 700087},
{'key': ['2020-07-20T08:00:00.00+05:30', 'death'], 'value': 27497},
{'key': ['2020-07-20T08:00:00.00+05:30', 'total_confirmed_cases'],
'value': 1118043},
{'key': ['2020-07-21T08:00:00.00+05:30', 'active_cases'], 'value': 402529},
{'key': ['2020-07-21T08:00:00.00+05:30', 'cured'], 'value': 724578},
{'key': ['2020-07-21T08:00:00.00+05:30', 'death'], 'value': 28084},
{'key': ['2020-07-21T08:00:00.00+05:30', 'total_confirmed_cases'],
'value': 1155191},
{'key': ['2020-07-22T08:00:00.00+05:30', 'active_cases'], 'value': 411133},
{'key': ['2020-07-22T08:00:00.00+05:30', 'cured'], 'value': 753050},
{'key': ['2020-07-22T08:00:00.00+05:30', 'death'], 'value': 28732},
{'key': ['2020-07-22T08:00:00.00+05:30', 'total_confirmed_cases'],
'value': 1192915},
{'key': ['2020-07-23T08:00:00.00+05:30', 'active_cases'], 'value': 426167},
{'key': ['2020-07-23T08:00:00.00+05:30', 'cured'], 'value': 782607},
{'key': ['2020-07-23T08:00:00.00+05:30', 'death'], 'value': 29861},
{'key': ['2020-07-23T08:00:00.00+05:30', 'total_confirmed_cases'],
'value': 1238635},
{'key': ['2020-07-24T08:00:00.00+05:30', 'active_cases'], 'value': 440135},
{'key': ['2020-07-24T08:00:00.00+05:30', 'cured'], 'value': 817209},
{'key': ['2020-07-24T08:00:00.00+05:30', 'death'], 'value': 30601},
{'key': ['2020-07-24T08:00:00.00+05:30', 'total_confirmed_cases'],
'value': 1287945},
{'key': ['2020-07-25T08:00:00.00+05:30', 'active_cases'], 'value': 456071},
{'key': ['2020-07-25T08:00:00.00+05:30', 'cured'], 'value': 849432},
{'key': ['2020-07-25T08:00:00.00+05:30', 'death'], 'value': 31358},
{'key': ['2020-07-25T08:00:00.00+05:30', 'total_confirmed_cases'],
'value': 1336861},
{'key': ['2020-07-26T08:00:00.00+05:30', 'active_cases'], 'value': 467882},
{'key': ['2020-07-26T08:00:00.00+05:30', 'cured'], 'value': 885577},
{'key': ['2020-07-26T08:00:00.00+05:30', 'death'], 'value': 32063},
{'key': ['2020-07-26T08:00:00.00+05:30', 'total_confirmed_cases'],
'value': 1385522},
{'key': ['2020-07-27T08:00:00.00+05:30', 'active_cases'], 'value': 485114},
{'key': ['2020-07-27T08:00:00.00+05:30', 'cured'], 'value': 917568},
{'key': ['2020-07-27T08:00:00.00+05:30', 'death'], 'value': 32771},
{'key': ['2020-07-27T08:00:00.00+05:30', 'total_confirmed_cases'],
'value': 1435453},
{'key': ['2020-07-28T08:00:00.00+05:30', 'active_cases'], 'value': 496988},
{'key': ['2020-07-28T08:00:00.00+05:30', 'cured'], 'value': 952743},
{'key': ['2020-07-28T08:00:00.00+05:30', 'death'], 'value': 33425},
{'key': ['2020-07-28T08:00:00.00+05:30', 'total_confirmed_cases'],
'value': 1483156},
{'key': ['2020-07-29T08:00:00.00+05:30', 'active_cases'], 'value': 509447},
{'key': ['2020-07-29T08:00:00.00+05:30', 'cured'], 'value': 988029},
{'key': ['2020-07-29T08:00:00.00+05:30', 'death'], 'value': 34193},
{'key': ['2020-07-29T08:00:00.00+05:30', 'total_confirmed_cases'],
'value': 1531669},
{'key': ['2020-07-30T08:00:00.00+05:30', 'active_cases'], 'value': 528242},
{'key': ['2020-07-30T08:00:00.00+05:30', 'cured'], 'value': 1020582},
{'key': ['2020-07-30T08:00:00.00+05:30', 'death'], 'value': 34968},
{'key': ['2020-07-30T08:00:00.00+05:30', 'total_confirmed_cases'],
'value': 1583792},
{'key': ['2020-07-31T08:00:00.00+05:30', 'active_cases'], 'value': 545318},
{'key': ['2020-07-31T08:00:00.00+05:30', 'cured'], 'value': 1057805},
{'key': ['2020-07-31T08:00:00.00+05:30', 'death'], 'value': 35747},
{'key': ['2020-07-31T08:00:00.00+05:30', 'total_confirmed_cases'],
'value': 1638870},
{'key': ['2020-08-01T08:00:00.00+05:30', 'active_cases'], 'value': 565103},
{'key': ['2020-08-01T08:00:00.00+05:30', 'cured'], 'value': 1094374},
{'key': ['2020-08-01T08:00:00.00+05:30', 'death'], 'value': 36511},
{'key': ['2020-08-01T08:00:00.00+05:30', 'total_confirmed_cases'],
'value': 1695988},
{'key': ['2020-08-02T08:00:00.00+05:30', 'active_cases'], 'value': 567730},
{'key': ['2020-08-02T08:00:00.00+05:30', 'cured'], 'value': 1145629},
{'key': ['2020-08-02T08:00:00.00+05:30', 'death'], 'value': 37364},
{'key': ['2020-08-02T08:00:00.00+05:30', 'total_confirmed_cases'],
'value': 1750723},
{'key': ['2020-08-03T08:00:00.00+05:30', 'active_cases'], 'value': 579357},
{'key': ['2020-08-03T08:00:00.00+05:30', 'cured'], 'value': 1186203},
{'key': ['2020-08-03T08:00:00.00+05:30', 'death'], 'value': 38135},
{'key': ['2020-08-03T08:00:00.00+05:30', 'total_confirmed_cases'],
'value': 1803695},
{'key': ['2020-08-04T08:00:00.00+05:30', 'active_cases'], 'value': 586298},
{'key': ['2020-08-04T08:00:00.00+05:30', 'cured'], 'value': 1230509},
{'key': ['2020-08-04T08:00:00.00+05:30', 'death'], 'value': 38938},
{'key': ['2020-08-04T08:00:00.00+05:30', 'total_confirmed_cases'],
'value': 1855745},
{'key': ['2020-08-05T08:00:00.00+05:30', 'active_cases'], 'value': 586244},
{'key': ['2020-08-05T08:00:00.00+05:30', 'cured'], 'value': 1282215},
{'key': ['2020-08-05T08:00:00.00+05:30', 'death'], 'value': 39795},
{'key': ['2020-08-05T08:00:00.00+05:30', 'total_confirmed_cases'],
'value': 1908254},
{'key': ['2020-08-06T08:00:00.00+05:30', 'active_cases'], 'value': 595501},
{'key': ['2020-08-06T08:00:00.00+05:30', 'cured'], 'value': 1328336},
{'key': ['2020-08-06T08:00:00.00+05:30', 'death'], 'value': 40699},
{'key': ['2020-08-06T08:00:00.00+05:30', 'total_confirmed_cases'],
'value': 1964536},
{'key': ['2020-08-07T08:00:00.00+05:30', 'active_cases'], 'value': 607384},
{'key': ['2020-08-07T08:00:00.00+05:30', 'cured'], 'value': 1378105},
{'key': ['2020-08-07T08:00:00.00+05:30', 'death'], 'value': 41585},
{'key': ['2020-08-07T08:00:00.00+05:30', 'total_confirmed_cases'],
'value': 2027074},
{'key': ['2020-08-08T08:00:00.00+05:30', 'active_cases'], 'value': 619088},
{'key': ['2020-08-08T08:00:00.00+05:30', 'cured'], 'value': 1427005},
{'key': ['2020-08-08T08:00:00.00+05:30', 'death'], 'value': 42518},
{'key': ['2020-08-08T08:00:00.00+05:30', 'total_confirmed_cases'],
'value': 2088611},
{'key': ['2020-08-09T08:00:00.00+05:30', 'active_cases'], 'value': 628747},
{'key': ['2020-08-09T08:00:00.00+05:30', 'cured'], 'value': 1480884},
{'key': ['2020-08-09T08:00:00.00+05:30', 'death'], 'value': 43379},
{'key': ['2020-08-09T08:00:00.00+05:30', 'total_confirmed_cases'],
'value': 2153010},
{'key': ['2020-08-10T08:00:00.00+05:30', 'active_cases'], 'value': 634945},
{'key': ['2020-08-10T08:00:00.00+05:30', 'cured'], 'value': 1535743},
{'key': ['2020-08-10T08:00:00.00+05:30', 'death'], 'value': 44386},
{'key': ['2020-08-10T08:00:00.00+05:30', 'total_confirmed_cases'],
'value': 2215074},
{'key': ['2020-08-11T08:00:00.00+05:30', 'active_cases'], 'value': 639929},
{'key': ['2020-08-11T08:00:00.00+05:30', 'cured'], 'value': 1583489},
{'key': ['2020-08-11T08:00:00.00+05:30', 'death'], 'value': 45257},
{'key': ['2020-08-11T08:00:00.00+05:30', 'total_confirmed_cases'],
'value': 2268675},
{'key': ['2020-08-12T08:00:00.00+05:30', 'active_cases'], 'value': 643948},
{'key': ['2020-08-12T08:00:00.00+05:30', 'cured'], 'value': 1639599},
{'key': ['2020-08-12T08:00:00.00+05:30', 'death'], 'value': 46091},
{'key': ['2020-08-12T08:00:00.00+05:30', 'total_confirmed_cases'],
'value': 2329638},
{'key': ['2020-08-13T08:00:00.00+05:30', 'active_cases'], 'value': 653622},
{'key': ['2020-08-13T08:00:00.00+05:30', 'cured'], 'value': 1695982},
{'key': ['2020-08-13T08:00:00.00+05:30', 'death'], 'value': 47033},
{'key': ['2020-08-13T08:00:00.00+05:30', 'total_confirmed_cases'],
'value': 2396637},
{'key': ['2020-08-14T08:00:00.00+05:30', 'active_cases'], 'value': 661595},
{'key': ['2020-08-14T08:00:00.00+05:30', 'cured'], 'value': 1751555},
{'key': ['2020-08-14T08:00:00.00+05:30', 'death'], 'value': 48040},
{'key': ['2020-08-14T08:00:00.00+05:30', 'total_confirmed_cases'],
'value': 2461190},
{'key': ['2020-08-15T08:00:00.00+05:30', 'active_cases'], 'value': 667950},
{'key': ['2020-08-15T08:00:00.00+05:30', 'cured'], 'value': 1808936},
{'key': ['2020-08-15T08:00:00.00+05:30', 'death'], 'value': 49036},
{'key': ['2020-08-15T08:00:00.00+05:30', 'total_confirmed_cases'],
'value': 2525922},
{'key': ['2020-08-16T08:00:00.00+05:30', 'active_cases'], 'value': 677444},
{'key': ['2020-08-16T08:00:00.00+05:30', 'cured'], 'value': 1862258},
{'key': ['2020-08-16T08:00:00.00+05:30', 'death'], 'value': 49980},
{'key': ['2020-08-16T08:00:00.00+05:30', 'total_confirmed_cases'],
'value': 2589682},
{'key': ['2020-08-17T08:00:00.00+05:30', 'active_cases'], 'value': 676900},
{'key': ['2020-08-17T08:00:00.00+05:30', 'cured'], 'value': 1919842},
{'key': ['2020-08-17T08:00:00.00+05:30', 'death'], 'value': 50921},
{'key': ['2020-08-17T08:00:00.00+05:30', 'total_confirmed_cases'],
'value': 2647663},
{'key': ['2020-08-18T08:00:00.00+05:30', 'active_cases'], 'value': 670066},
{'key': ['2020-08-18T08:00:00.00+05:30', 'cured'], 'value': 1977779},
{'key': ['2020-08-18T08:00:00.00+05:30', 'death'], 'value': 51797},
{'key': ['2020-08-18T08:00:00.00+05:30', 'total_confirmed_cases'],
'value': 2699642},
{'key': ['2020-08-19T08:00:00.00+05:30', 'active_cases'], 'value': 676514},
{'key': ['2020-08-19T08:00:00.00+05:30', 'cured'], 'value': 2037870},
{'key': ['2020-08-19T08:00:00.00+05:30', 'death'], 'value': 52889},
{'key': ['2020-08-19T08:00:00.00+05:30', 'total_confirmed_cases'],
'value': 2767273},
{'key': ['2020-08-20T08:00:00.00+05:30', 'active_cases'], 'value': 686395},
{'key': ['2020-08-20T08:00:00.00+05:30', 'cured'], 'value': 2096664},
{'key': ['2020-08-20T08:00:00.00+05:30', 'death'], 'value': 53866},
{'key': ['2020-08-20T08:00:00.00+05:30', 'total_confirmed_cases'],
'value': 2836925},
{'key': ['2020-08-21T08:00:00.00+05:30', 'active_cases'], 'value': 692030},
{'key': ['2020-08-21T08:00:00.00+05:30', 'cured'], 'value': 2158946},
{'key': ['2020-08-21T08:00:00.00+05:30', 'death'], 'value': 54849},
{'key': ['2020-08-21T08:00:00.00+05:30', 'total_confirmed_cases'],
'value': 2905825},
{'key': ['2020-08-22T08:00:00.00+05:30', 'active_cases'], 'value': 697330},
{'key': ['2020-08-22T08:00:00.00+05:30', 'cured'], 'value': 2222577},
{'key': ['2020-08-22T08:00:00.00+05:30', 'death'], 'value': 55794},
{'key': ['2020-08-22T08:00:00.00+05:30', 'total_confirmed_cases'],
'value': 2975701},
{'key': ['2020-08-23T08:00:00.00+05:30', 'active_cases'], 'value': 707668},
{'key': ['2020-08-23T08:00:00.00+05:30', 'cured'], 'value': 2280566},
{'key': ['2020-08-23T08:00:00.00+05:30', 'death'], 'value': 56706},
{'key': ['2020-08-23T08:00:00.00+05:30', 'total_confirmed_cases'],
'value': 3044940},
{'key': ['2020-08-24T08:00:00.00+05:30', 'active_cases'], 'value': 710771},
{'key': ['2020-08-24T08:00:00.00+05:30', 'cured'], 'value': 2338035},
{'key': ['2020-08-24T08:00:00.00+05:30', 'death'], 'value': 57542},
{'key': ['2020-08-24T08:00:00.00+05:30', 'total_confirmed_cases'],
'value': 3106348},
{'key': ['2020-08-25T08:00:00.00+05:30', 'active_cases'], 'value': 704348},
{'key': ['2020-08-25T08:00:00.00+05:30', 'cured'], 'value': 2404585},
{'key': ['2020-08-25T08:00:00.00+05:30', 'death'], 'value': 58390},
{'key': ['2020-08-25T08:00:00.00+05:30', 'total_confirmed_cases'],
'value': 3167323},
{'key': ['2020-08-26T08:00:00.00+05:30', 'active_cases'], 'value': 707267},
{'key': ['2020-08-26T08:00:00.00+05:30', 'cured'], 'value': 2467758},
{'key': ['2020-08-26T08:00:00.00+05:30', 'death'], 'value': 59449},
{'key': ['2020-08-26T08:00:00.00+05:30', 'total_confirmed_cases'],
'value': 3234474},
{'key': ['2020-08-27T08:00:00.00+05:30', 'active_cases'], 'value': 725991},
{'key': ['2020-08-27T08:00:00.00+05:30', 'cured'], 'value': 2523771},
{'key': ['2020-08-27T08:00:00.00+05:30', 'death'], 'value': 60472},
{'key': ['2020-08-27T08:00:00.00+05:30', 'total_confirmed_cases'],
'value': 3310234},
{'key': ['2020-08-28T08:00:00.00+05:30', 'active_cases'], 'value': 742023},
{'key': ['2020-08-28T08:00:00.00+05:30', 'cured'], 'value': 2583948},
{'key': ['2020-08-28T08:00:00.00+05:30', 'death'], 'value': 61529},
{'key': ['2020-08-28T08:00:00.00+05:30', 'total_confirmed_cases'],
'value': 3387500},
{'key': ['2020-08-29T08:00:00.00+05:30', 'active_cases'], 'value': 752424},
{'key': ['2020-08-29T08:00:00.00+05:30', 'cured'], 'value': 2648998},
{'key': ['2020-08-29T08:00:00.00+05:30', 'death'], 'value': 62550},
{'key': ['2020-08-29T08:00:00.00+05:30', 'total_confirmed_cases'],
'value': 3463972},
{'key': ['2020-08-30T08:00:00.00+05:30', 'active_cases'], 'value': 765302},
{'key': ['2020-08-30T08:00:00.00+05:30', 'cured'], 'value': 2713933},
{'key': ['2020-08-30T08:00:00.00+05:30', 'death'], 'value': 63498},
{'key': ['2020-08-30T08:00:00.00+05:30', 'total_confirmed_cases'],
'value': 3542733},
{'key': ['2020-08-31T08:00:00.00+05:30', 'active_cases'], 'value': 781975},
{'key': ['2020-08-31T08:00:00.00+05:30', 'cured'], 'value': 2774801},
{'key': ['2020-08-31T08:00:00.00+05:30', 'death'], 'value': 64469},
{'key': ['2020-08-31T08:00:00.00+05:30', 'total_confirmed_cases'],
'value': 3621245},
{'key': ['2020-09-01T08:00:00.00+05:30', 'active_cases'], 'value': 785996},
{'key': ['2020-09-01T08:00:00.00+05:30', 'cured'], 'value': 2839882},
{'key': ['2020-09-01T08:00:00.00+05:30', 'death'], 'value': 65288},
{'key': ['2020-09-01T08:00:00.00+05:30', 'total_confirmed_cases'],
'value': 3691166},
{'key': ['2020-09-02T08:00:00.00+05:30', 'active_cases'], 'value': 801282},
{'key': ['2020-09-02T08:00:00.00+05:30', 'cured'], 'value': 2901908},
{'key': ['2020-09-02T08:00:00.00+05:30', 'death'], 'value': 66333},
{'key': ['2020-09-02T08:00:00.00+05:30', 'total_confirmed_cases'],
'value': 3769523},
{'key': ['2020-09-03T08:00:00.00+05:30', 'active_cases'], 'value': 815538},
{'key': ['2020-09-03T08:00:00.00+05:30', 'cured'], 'value': 2970492},
{'key': ['2020-09-03T08:00:00.00+05:30', 'death'], 'value': 67376},
{'key': ['2020-09-03T08:00:00.00+05:30', 'total_confirmed_cases'],
'value': 3853406},
{'key': ['2020-09-04T08:00:00.00+05:30', 'active_cases'], 'value': 831124},
{'key': ['2020-09-04T08:00:00.00+05:30', 'cured'], 'value': 3037151},
{'key': ['2020-09-04T08:00:00.00+05:30', 'death'], 'value': 68472},
{'key': ['2020-09-04T08:00:00.00+05:30', 'total_confirmed_cases'],
'value': 3936747},
{'key': ['2020-09-05T08:00:00.00+05:30', 'active_cases'], 'value': 846395},
{'key': ['2020-09-05T08:00:00.00+05:30', 'cured'], 'value': 3107223},
{'key': ['2020-09-05T08:00:00.00+05:30', 'death'], 'value': 69561},
{'key': ['2020-09-05T08:00:00.00+05:30', 'total_confirmed_cases'],
'value': 4023179},
{'key': ['2020-09-06T08:00:00.00+05:30', 'active_cases'], 'value': 862320},
{'key': ['2020-09-06T08:00:00.00+05:30', 'cured'], 'value': 3180865},
{'key': ['2020-09-06T08:00:00.00+05:30', 'death'], 'value': 70626},
{'key': ['2020-09-06T08:00:00.00+05:30', 'total_confirmed_cases'],
'value': 4113811},
{'key': ['2020-09-07T08:00:00.00+05:30', 'active_cases'], 'value': 882542},
{'key': ['2020-09-07T08:00:00.00+05:30', 'cured'], 'value': 3250429},
{'key': ['2020-09-07T08:00:00.00+05:30', 'death'], 'value': 71642},
{'key': ['2020-09-07T08:00:00.00+05:30', 'total_confirmed_cases'],
'value': 4204613},
{'key': ['2020-09-08T08:00:00.00+05:30', 'active_cases'], 'value': 883697},
{'key': ['2020-09-08T08:00:00.00+05:30', 'cured'], 'value': 3323950},
{'key': ['2020-09-08T08:00:00.00+05:30', 'death'], 'value': 72775},
{'key': ['2020-09-08T08:00:00.00+05:30', 'total_confirmed_cases'],
'value': 4280422},
{'key': ['2020-09-09T08:00:00.00+05:30', 'active_cases'], 'value': 897394},
{'key': ['2020-09-09T08:00:00.00+05:30', 'cured'], 'value': 3398844},
{'key': ['2020-09-09T08:00:00.00+05:30', 'death'], 'value': 73890},
{'key': ['2020-09-09T08:00:00.00+05:30', 'total_confirmed_cases'],
'value': 4370128},
...]}
As you can see, the data is stored in a key-value pair format, where the key contains the timestamp and attribute name, while the value contains the number associated with that attribute. We can wrangle this data format into a more table-like structure so that we can convert this dictionary into a dataframe.
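To make the layout concrete, here is a minimal sketch that unpacks a single row copied from the output above. The variable names (`row`, `record`) are illustrative and not part of the notebook; the sketch only assumes that each row has a two-element `key` list and a numeric `value`, as the output shows.

```python
# One row copied from the all_totals.json output shown above
row = {'key': ['2020-07-08T08:00:00.00+05:30', 'active_cases'], 'value': 264944}

# The key is a two-element list: [timestamp, attribute name]
timestamp, attribute = row['key']

# The first ten characters of the timestamp are the date (YYYY-MM-DD)
date = timestamp[:10]

# One table cell: the attribute becomes a column, the date becomes the row label
record = {'date': date, attribute: row['value']}
print(record)
```

Repeating this for every row, and grouping the results by attribute, gives one column per attribute and one row per date.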
# Takes input from the JSON dictionary and converts it into
# a new dictionary with a different format. The attributes
# found in the input ("active_cases", "death", etc.) are keys in
# the new dictionary, and each value is a dictionary mapping
# dates to the number reported for that attribute.
def extract_data(data):
    extracted_data = {}
    rows = data["rows"]
    for row in rows:
        time, attribute, value = extract_row(row)
        if attribute not in extracted_data:
            extracted_data[attribute] = {time: value}
        else:
            extracted_data[attribute][time] = value
    return extracted_data

# Extracts the date, attribute, and value information
# from one row of the JSON dictionary. Used as a helper
# function in extract_data()
def extract_row(row):
    time = row['key'][0]
    attribute = row['key'][1]
    value = row['value']
    # keep only the date part (YYYY-MM-DD) of the timestamp
    return time[:10], attribute, value
data_dict = extract_data(data)
df = pd.DataFrame(data_dict)
df
| | active_cases | cured | death | total_confirmed_cases |
|---|---|---|---|---|
| 2020-01-30 | 1 | 0 | 0 | 1 |
| 2020-02-02 | 2 | 0 | 0 | 2 |
| 2020-02-03 | 3 | 0 | 0 | 3 |
| 2020-03-02 | 5 | 0 | 0 | 5 |
| 2020-03-03 | 6 | 0 | 0 | 6 |
| ... | ... | ... | ... | ... |
| 2021-05-09 | 3736648 | 18317404 | 242362 | 22296414 |
| 2021-05-10 | 3745237 | 18671222 | 246116 | 22662575 |
| 2021-05-11 | 3715221 | 19027304 | 249992 | 22992517 |
| 2021-05-12 | 3704099 | 19382642 | 254197 | 23340938 |
| 2021-05-13 | 3710525 | 19734823 | 258317 | 23703665 |
435 rows × 4 columns
There is a problem here with the date ranges: they are not continuous! If we ever want to visualize the data, it will be important to have a continuous date range. We can get one by adding new rows for the missing days and simply taking the previous row's values as the values for the new rows. Filling in the missing values this way is reasonable because cured, death, and total_confirmed_cases are running totals, and active_cases is a current count, so the last reported value is the best available estimate for a day with no report.
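Before filling anything in, the gaps can be listed explicitly. The following is a small sketch, not part of the original notebook, that uses only the first five dates from the table above; the names `reported`, `full_range`, and `missing` are illustrative.

```python
import pandas as pd

# The first five dates from the table above, as reported in the source data
reported = pd.DatetimeIndex(
    ["2020-01-30", "2020-02-02", "2020-02-03", "2020-03-02", "2020-03-03"]
)

# Every calendar day between the first and last reported date
full_range = pd.date_range(reported.min(), reported.max())

# Days in the full range that have no row in the data
missing = full_range.difference(reported)
print(len(missing), "missing days, starting with", missing[0].date())
```

The same comparison could be run against the full dataframe's index once it has been converted to dates.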
# calculate date range index using lowest and highest dates
idx = pd.date_range(df.index.min(), df.index.max())
df.index = pd.DatetimeIndex(df.index)
# reindex the data using date range, and fill any missing dates
# with the previous date's row values
df = df.reindex(index=idx, method='ffill')
df
| | active_cases | cured | death | total_confirmed_cases |
|---|---|---|---|---|
| 2020-01-30 | 1 | 0 | 0 | 1 |
| 2020-01-31 | 1 | 0 | 0 | 1 |
| 2020-02-01 | 1 | 0 | 0 | 1 |
| 2020-02-02 | 2 | 0 | 0 | 2 |
| 2020-02-03 | 3 | 0 | 0 | 3 |
| ... | ... | ... | ... | ... |
| 2021-05-09 | 3736648 | 18317404 | 242362 | 22296414 |
| 2021-05-10 | 3745237 | 18671222 | 246116 | 22662575 |
| 2021-05-11 | 3715221 | 19027304 | 249992 | 22992517 |
| 2021-05-12 | 3704099 | 19382642 | 254197 | 23340938 |
| 2021-05-13 | 3710525 | 19734823 | 258317 | 23703665 |
470 rows × 4 columns
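Once rows have been carried forward, the table no longer shows which days were actually reported. As a hedged sketch (the `was_reported` column and the `sample` frame are assumptions for illustration, not part of the notebook), the same reindexing can be applied to the first three reported rows while flagging the filled-in days:

```python
import pandas as pd

# First three reported rows of total_confirmed_cases from the original table
sample = pd.DataFrame(
    {"total_confirmed_cases": [1, 2, 3]},
    index=pd.DatetimeIndex(["2020-01-30", "2020-02-02", "2020-02-03"]),
)

# Continuous daily range, filled forward from the last reported value
full_range = pd.date_range(sample.index.min(), sample.index.max())
filled = sample.reindex(full_range, method="ffill")

# Mark which rows came from the source and which were carried forward
filled["was_reported"] = filled.index.isin(sample.index)
print(filled)
```

Keeping such a flag makes it possible to exclude the carried-forward rows later, for example when counting how many days the source actually reported.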
Now that we have the data in a manageable format, we can start visualizing it. Let's start by plotting the number of active cases, total number of confirmed cases, number of deaths, and number of people cured.
df.plot(y='active_cases')
<AxesSubplot:>
df.plot(y='total_confirmed_cases')
<AxesSubplot:>
df.plot(y='death')
<AxesSubplot:>
df.plot(y='cured')
<AxesSubplot:>
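Because total_confirmed_cases, death, and cured are running totals, their curves can only go up, and the pace of the outbreak shows in how steep they are. A minimal sketch of turning a running total into daily new counts, using the last five total_confirmed_cases values from the table above (the names `totals` and `daily_new` are illustrative):

```python
import pandas as pd

# Last five rows of total_confirmed_cases from the table above
totals = pd.Series(
    [22296414, 22662575, 22992517, 23340938, 23703665],
    index=pd.date_range("2021-05-09", periods=5),
    name="total_confirmed_cases",
)

# Difference between consecutive days = newly confirmed cases on each day;
# the first day has no previous value, so it is dropped
daily_new = totals.diff().dropna().astype(int)
print(daily_new)
```

Applied to the whole column, the same differencing would give a daily series that could be plotted alongside the cumulative one.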
We can very clearly see from these plots that cases and deaths began rising sharply around late March to early April 2021. This coincides with the spread of a new variant of the virus across India. But what made this variant so much harder to handle than earlier ones, and does the situation differ by state in India?
data2 = get_json('mohfw.json')
data2
 ... 'confirmed': 83,
'death': 3}},
{'id': '2020-03-31T20:30:00.00+05:30|kl',
'key': '2020-03-31T20:30:00.00+05:30|kl',
'value': {'_id': '2020-03-31T20:30:00.00+05:30|kl',
'_rev': '1-738c6bfe4b3ccc8209aff155455f7246',
'state': 'kl',
'type': 'cases',
'report_time': '2020-03-31T20:30:00.00+05:30',
'source': 'mohfw',
'cured': 19,
'confirmed': 234,
'death': 1}},
{'id': '2020-03-31T20:30:00.00+05:30|la',
'key': '2020-03-31T20:30:00.00+05:30|la',
'value': {'_id': '2020-03-31T20:30:00.00+05:30|la',
'_rev': '1-5d4c778dfb1cae0950f075924766c058',
'state': 'la',
'type': 'cases',
'report_time': '2020-03-31T20:30:00.00+05:30',
'source': 'mohfw',
'cured': 3,
'confirmed': 13,
'death': 0}},
{'id': '2020-03-31T20:30:00.00+05:30|mh',
'key': '2020-03-31T20:30:00.00+05:30|mh',
'value': {'_id': '2020-03-31T20:30:00.00+05:30|mh',
'_rev': '1-84e5ecdef08bb45e7ac52269bf4fb314',
'state': 'mh',
'type': 'cases',
'report_time': '2020-03-31T20:30:00.00+05:30',
'source': 'mohfw',
'cured': 39,
'confirmed': 216,
'death': 9}},
{'id': '2020-03-31T20:30:00.00+05:30|mn',
'key': '2020-03-31T20:30:00.00+05:30|mn',
'value': {'_id': '2020-03-31T20:30:00.00+05:30|mn',
'_rev': '1-3b5dce76f3b5291f9ea00493a86e4ecc',
'state': 'mn',
'type': 'cases',
'report_time': '2020-03-31T20:30:00.00+05:30',
'source': 'mohfw',
'cured': 0,
'confirmed': 1,
'death': 0}},
{'id': '2020-03-31T20:30:00.00+05:30|mp',
'key': '2020-03-31T20:30:00.00+05:30|mp',
'value': {'_id': '2020-03-31T20:30:00.00+05:30|mp',
'_rev': '1-97090e559be4db777dc4211769ca9243',
'state': 'mp',
'type': 'cases',
'report_time': '2020-03-31T20:30:00.00+05:30',
'source': 'mohfw',
'cured': 0,
'confirmed': 47,
'death': 3}},
{'id': '2020-03-31T20:30:00.00+05:30|mz',
'key': '2020-03-31T20:30:00.00+05:30|mz',
'value': {'_id': '2020-03-31T20:30:00.00+05:30|mz',
'_rev': '1-4310de10742e9252e08318f47391158f',
'state': 'mz',
'type': 'cases',
'report_time': '2020-03-31T20:30:00.00+05:30',
'source': 'mohfw',
'cured': 0,
'confirmed': 1,
'death': 0}},
{'id': '2020-03-31T20:30:00.00+05:30|or',
'key': '2020-03-31T20:30:00.00+05:30|or',
'value': {'_id': '2020-03-31T20:30:00.00+05:30|or',
'_rev': '1-e75ec5c62f10e293d4bb6a80817ab2eb',
'state': 'or',
'type': 'cases',
'report_time': '2020-03-31T20:30:00.00+05:30',
'source': 'mohfw',
'cured': 0,
'confirmed': 3,
'death': 0}},
{'id': '2020-03-31T20:30:00.00+05:30|pb',
'key': '2020-03-31T20:30:00.00+05:30|pb',
'value': {'_id': '2020-03-31T20:30:00.00+05:30|pb',
'_rev': '1-c69087dff64a71adec17bb61d74e845a',
'state': 'pb',
'type': 'cases',
'report_time': '2020-03-31T20:30:00.00+05:30',
'source': 'mohfw',
'cured': 1,
'confirmed': 41,
'death': 3}},
{'id': '2020-03-31T20:30:00.00+05:30|py',
'key': '2020-03-31T20:30:00.00+05:30|py',
'value': {'_id': '2020-03-31T20:30:00.00+05:30|py',
'_rev': '1-5d7dbab67042c3234890ab6bd15f5d92',
'state': 'py',
'type': 'cases',
'report_time': '2020-03-31T20:30:00.00+05:30',
'source': 'mohfw',
'cured': 0,
'confirmed': 1,
'death': 0}},
{'id': '2020-03-31T20:30:00.00+05:30|rj',
'key': '2020-03-31T20:30:00.00+05:30|rj',
'value': {'_id': '2020-03-31T20:30:00.00+05:30|rj',
'_rev': '1-b6f2f51990c53dc7c94fd9a2d282d22c',
'state': 'rj',
'type': 'cases',
'report_time': '2020-03-31T20:30:00.00+05:30',
'source': 'mohfw',
'cured': 3,
'confirmed': 74,
'death': 0}},
{'id': '2020-03-31T20:30:00.00+05:30|tg',
'key': '2020-03-31T20:30:00.00+05:30|tg',
'value': {'_id': '2020-03-31T20:30:00.00+05:30|tg',
'_rev': '1-e01b13cefed57a525bd763a0d41874f8',
'state': 'tg',
'type': 'cases',
'report_time': '2020-03-31T20:30:00.00+05:30',
'source': 'mohfw',
'cured': 1,
'confirmed': 79,
'death': 1}},
{'id': '2020-03-31T20:30:00.00+05:30|tn',
'key': '2020-03-31T20:30:00.00+05:30|tn',
'value': {'_id': '2020-03-31T20:30:00.00+05:30|tn',
'_rev': '1-bd63b887b78543bc53779a054b92c03f',
'state': 'tn',
'type': 'cases',
'report_time': '2020-03-31T20:30:00.00+05:30',
'source': 'mohfw',
'cured': 4,
'confirmed': 74,
'death': 1}},
{'id': '2020-03-31T20:30:00.00+05:30|up',
'key': '2020-03-31T20:30:00.00+05:30|up',
'value': {'_id': '2020-03-31T20:30:00.00+05:30|up',
'_rev': '1-5043e1e7d2f4c63d47b048cd51c68007',
'state': 'up',
'type': 'cases',
'report_time': '2020-03-31T20:30:00.00+05:30',
'source': 'mohfw',
'cured': 14,
'confirmed': 101,
'death': 0}},
{'id': '2020-03-31T20:30:00.00+05:30|ut',
'key': '2020-03-31T20:30:00.00+05:30|ut',
'value': {'_id': '2020-03-31T20:30:00.00+05:30|ut',
'_rev': '1-fdd1cfc9c6c344a1e08528b6e6c24c9e',
'state': 'ut',
'type': 'cases',
'report_time': '2020-03-31T20:30:00.00+05:30',
'source': 'mohfw',
'cured': 2,
'confirmed': 7,
'death': 0}},
{'id': '2020-03-31T20:30:00.00+05:30|wb',
'key': '2020-03-31T20:30:00.00+05:30|wb',
'value': {'_id': '2020-03-31T20:30:00.00+05:30|wb',
'_rev': '1-7237404564471b48490df1671211ce61',
'state': 'wb',
'type': 'cases',
'report_time': '2020-03-31T20:30:00.00+05:30',
'source': 'mohfw',
'cured': 0,
'confirmed': 26,
'death': 2}},
{'id': '2020-04-01T09:00:00.00+05:30|an',
'key': '2020-04-01T09:00:00.00+05:30|an',
'value': {'_id': '2020-04-01T09:00:00.00+05:30|an',
'_rev': '1-2dea80a1f3a4f88c534bb0b81c041a9f',
'state': 'an',
'report_time': '2020-04-01T09:00:00.00+05:30',
'confirmed': 10,
'cured': 0,
'death': 0,
'source': 'mohfw',
'type': 'cases'}},
{'id': '2020-04-01T09:00:00.00+05:30|ap',
'key': '2020-04-01T09:00:00.00+05:30|ap',
'value': {'_id': '2020-04-01T09:00:00.00+05:30|ap',
'_rev': '1-a2523deee84ed579fcd4accc01cddc60',
'state': 'ap',
'report_time': '2020-04-01T09:00:00.00+05:30',
'confirmed': 83,
'cured': 1,
'death': 0,
'source': 'mohfw',
'type': 'cases'}},
{'id': '2020-04-01T09:00:00.00+05:30|br',
'key': '2020-04-01T09:00:00.00+05:30|br',
'value': {'_id': '2020-04-01T09:00:00.00+05:30|br',
'_rev': '1-389121d97b1037ebe1965b8e273a1d94',
'state': 'br',
'report_time': '2020-04-01T09:00:00.00+05:30',
'confirmed': 23,
'cured': 0,
'death': 1,
'source': 'mohfw',
'type': 'cases'}},
{'id': '2020-04-01T09:00:00.00+05:30|ch',
'key': '2020-04-01T09:00:00.00+05:30|ch',
'value': {'_id': '2020-04-01T09:00:00.00+05:30|ch',
'_rev': '1-cc57dac9b601782638bd5d580c9130c7',
'state': 'ch',
'report_time': '2020-04-01T09:00:00.00+05:30',
'confirmed': 13,
'cured': 0,
'death': 0,
'source': 'mohfw',
'type': 'cases'}},
{'id': '2020-04-01T09:00:00.00+05:30|ct',
'key': '2020-04-01T09:00:00.00+05:30|ct',
'value': {'_id': '2020-04-01T09:00:00.00+05:30|ct',
'_rev': '1-cbb06cf9ea67ae1e15c51925cf64a121',
'state': 'ct',
'report_time': '2020-04-01T09:00:00.00+05:30',
'confirmed': 9,
'cured': 0,
'death': 0,
'source': 'mohfw',
'type': 'cases'}},
{'id': '2020-04-01T09:00:00.00+05:30|dl',
'key': '2020-04-01T09:00:00.00+05:30|dl',
'value': {'_id': '2020-04-01T09:00:00.00+05:30|dl',
'_rev': '1-ddeb0178f100e20ccc0ffbf91fc46300',
'state': 'dl',
'report_time': '2020-04-01T09:00:00.00+05:30',
'confirmed': 120,
'cured': 6,
'death': 2,
'source': 'mohfw',
'type': 'cases'}},
{'id': '2020-04-01T09:00:00.00+05:30|ga',
'key': '2020-04-01T09:00:00.00+05:30|ga',
'value': {'_id': '2020-04-01T09:00:00.00+05:30|ga',
'_rev': '1-5e5948536ac7990fa7ddf6718af00590',
'state': 'ga',
'report_time': '2020-04-01T09:00:00.00+05:30',
'confirmed': 5,
'cured': 0,
'death': 0,
'source': 'mohfw',
'type': 'cases'}},
{'id': '2020-04-01T09:00:00.00+05:30|gj',
'key': '2020-04-01T09:00:00.00+05:30|gj',
'value': {'_id': '2020-04-01T09:00:00.00+05:30|gj',
'_rev': '1-7eb118d020f1bf214774c8715b40e8c5',
'state': 'gj',
'report_time': '2020-04-01T09:00:00.00+05:30',
'confirmed': 74,
'cured': 5,
'death': 6,
'source': 'mohfw',
'type': 'cases'}},
{'id': '2020-04-01T09:00:00.00+05:30|hp',
'key': '2020-04-01T09:00:00.00+05:30|hp',
'value': {'_id': '2020-04-01T09:00:00.00+05:30|hp',
'_rev': '1-8a269427a7b73ac2d356eee8d9137159',
'state': 'hp',
'report_time': '2020-04-01T09:00:00.00+05:30',
'confirmed': 3,
'cured': 0,
'death': 1,
'source': 'mohfw',
'type': 'cases'}},
{'id': '2020-04-01T09:00:00.00+05:30|hr',
'key': '2020-04-01T09:00:00.00+05:30|hr',
'value': {'_id': '2020-04-01T09:00:00.00+05:30|hr',
'_rev': '1-ab730d43312b42fca5e970c93a18102a',
'state': 'hr',
'report_time': '2020-04-01T09:00:00.00+05:30',
'confirmed': 43,
'cured': 21,
'death': 0,
'source': 'mohfw',
'type': 'cases'}},
{'id': '2020-04-01T09:00:00.00+05:30|jk',
'key': '2020-04-01T09:00:00.00+05:30|jk',
'value': {'_id': '2020-04-01T09:00:00.00+05:30|jk',
'_rev': '1-79a6c136cb7100b0d0a9ff0fba8fe6e0',
'state': 'jk',
'report_time': '2020-04-01T09:00:00.00+05:30',
'confirmed': 55,
'cured': 2,
'death': 2,
'source': 'mohfw',
'type': 'cases'}},
{'id': '2020-04-01T09:00:00.00+05:30|ka',
'key': '2020-04-01T09:00:00.00+05:30|ka',
'value': {'_id': '2020-04-01T09:00:00.00+05:30|ka',
'_rev': '1-d561828b485767e9269c3cc5906e768e',
'state': 'ka',
'report_time': '2020-04-01T09:00:00.00+05:30',
'confirmed': 101,
'cured': 8,
'death': 3,
'source': 'mohfw',
'type': 'cases'}},
{'id': '2020-04-01T09:00:00.00+05:30|kl',
'key': '2020-04-01T09:00:00.00+05:30|kl',
'value': {'_id': '2020-04-01T09:00:00.00+05:30|kl',
'_rev': '1-82c1907ab22dc77a7f959d339e7b41b5',
'state': 'kl',
'report_time': '2020-04-01T09:00:00.00+05:30',
'confirmed': 241,
'cured': 23,
'death': 2,
'source': 'mohfw',
'type': 'cases'}},
{'id': '2020-04-01T09:00:00.00+05:30|la',
'key': '2020-04-01T09:00:00.00+05:30|la',
'value': {'_id': '2020-04-01T09:00:00.00+05:30|la',
'_rev': '1-ee19cce132d4c39c7caa35d3ec28534e',
'state': 'la',
'report_time': '2020-04-01T09:00:00.00+05:30',
'confirmed': 13,
'cured': 3,
'death': 0,
'source': 'mohfw',
'type': 'cases'}},
{'id': '2020-04-01T09:00:00.00+05:30|mh',
'key': '2020-04-01T09:00:00.00+05:30|mh',
'value': {'_id': '2020-04-01T09:00:00.00+05:30|mh',
'_rev': '1-382abb0d6b69fb84a3803ef656ea6633',
'state': 'mh',
'report_time': '2020-04-01T09:00:00.00+05:30',
'confirmed': 302,
'cured': 39,
'death': 9,
'source': 'mohfw',
'type': 'cases'}},
{'id': '2020-04-01T09:00:00.00+05:30|mn',
'key': '2020-04-01T09:00:00.00+05:30|mn',
'value': {'_id': '2020-04-01T09:00:00.00+05:30|mn',
'_rev': '1-8b57e9d96f61449dea574ebe10b37d02',
'state': 'mn',
'report_time': '2020-04-01T09:00:00.00+05:30',
'confirmed': 1,
'cured': 0,
'death': 0,
'source': 'mohfw',
'type': 'cases'}},
{'id': '2020-04-01T09:00:00.00+05:30|mp',
'key': '2020-04-01T09:00:00.00+05:30|mp',
'value': {'_id': '2020-04-01T09:00:00.00+05:30|mp',
'_rev': '1-b7b10a85510a2439c6d47f7e7765fa7a',
'state': 'mp',
'report_time': '2020-04-01T09:00:00.00+05:30',
'confirmed': 47,
'cured': 0,
'death': 3,
'source': 'mohfw',
'type': 'cases'}},
{'id': '2020-04-01T09:00:00.00+05:30|mz',
'key': '2020-04-01T09:00:00.00+05:30|mz',
'value': {'_id': '2020-04-01T09:00:00.00+05:30|mz',
'_rev': '1-90823eed5c73b5f6a9347c73f977daa9',
'state': 'mz',
'report_time': '2020-04-01T09:00:00.00+05:30',
'confirmed': 1,
'cured': 0,
'death': 0,
'source': 'mohfw',
'type': 'cases'}},
{'id': '2020-04-01T09:00:00.00+05:30|or',
'key': '2020-04-01T09:00:00.00+05:30|or',
'value': {'_id': '2020-04-01T09:00:00.00+05:30|or',
'_rev': '1-a48f7ef5aa70d851b08614dc3f74524d',
'state': 'or',
'report_time': '2020-04-01T09:00:00.00+05:30',
'confirmed': 4,
'cured': 0,
'death': 0,
'source': 'mohfw',
'type': 'cases'}},
{'id': '2020-04-01T09:00:00.00+05:30|pb',
'key': '2020-04-01T09:00:00.00+05:30|pb',
'value': {'_id': '2020-04-01T09:00:00.00+05:30|pb',
'_rev': '1-0c657a0f0da1cee94b920adfbf2df84a',
'state': 'pb',
'report_time': '2020-04-01T09:00:00.00+05:30',
'confirmed': 41,
'cured': 1,
'death': 3,
'source': 'mohfw',
'type': 'cases'}},
{'id': '2020-04-01T09:00:00.00+05:30|py',
'key': '2020-04-01T09:00:00.00+05:30|py',
'value': {'_id': '2020-04-01T09:00:00.00+05:30|py',
'_rev': '1-bde26436481e0bad9c4dc99ca1aafe6a',
'state': 'py',
'report_time': '2020-04-01T09:00:00.00+05:30',
'confirmed': 1,
'cured': 0,
'death': 0,
'source': 'mohfw',
'type': 'cases'}},
{'id': '2020-04-01T09:00:00.00+05:30|rj',
'key': '2020-04-01T09:00:00.00+05:30|rj',
'value': {'_id': '2020-04-01T09:00:00.00+05:30|rj',
'_rev': '1-ada744c0d20dcbaadd4d3d795dcd5cc2',
'state': 'rj',
'report_time': '2020-04-01T09:00:00.00+05:30',
'confirmed': 93,
'cured': 3,
'death': 0,
'source': 'mohfw',
'type': 'cases'}},
{'id': '2020-04-01T09:00:00.00+05:30|tg',
'key': '2020-04-01T09:00:00.00+05:30|tg',
'value': {'_id': '2020-04-01T09:00:00.00+05:30|tg',
'_rev': '1-022d433ab82f68ccd6502ffcac7d2d3f',
'state': 'tg',
'report_time': '2020-04-01T09:00:00.00+05:30',
'confirmed': 94,
'cured': 1,
'death': 3,
'source': 'mohfw',
'type': 'cases'}},
{'id': '2020-04-01T09:00:00.00+05:30|tn',
'key': '2020-04-01T09:00:00.00+05:30|tn',
'value': {'_id': '2020-04-01T09:00:00.00+05:30|tn',
'_rev': '1-ef1fce374096cc15f7982814fcbc5fe2',
'state': 'tn',
'report_time': '2020-04-01T09:00:00.00+05:30',
'confirmed': 124,
'cured': 4,
'death': 1,
'source': 'mohfw',
'type': 'cases'}},
{'id': '2020-04-01T09:00:00.00+05:30|up',
'key': '2020-04-01T09:00:00.00+05:30|up',
'value': {'_id': '2020-04-01T09:00:00.00+05:30|up',
'_rev': '1-fe40869e34040aefa29a5d1f526463b7',
'state': 'up',
'report_time': '2020-04-01T09:00:00.00+05:30',
'confirmed': 103,
'cured': 14,
'death': 0,
'source': 'mohfw',
'type': 'cases'}},
{'id': '2020-04-01T09:00:00.00+05:30|ut',
'key': '2020-04-01T09:00:00.00+05:30|ut',
'value': {'_id': '2020-04-01T09:00:00.00+05:30|ut',
'_rev': '1-cd6a945c8269e68efedf3fcb0f23ec7c',
'state': 'ut',
'report_time': '2020-04-01T09:00:00.00+05:30',
'confirmed': 7,
'cured': 2,
'death': 0,
'source': 'mohfw',
'type': 'cases'}},
{'id': '2020-04-01T09:00:00.00+05:30|wb',
'key': '2020-04-01T09:00:00.00+05:30|wb',
'value': {'_id': '2020-04-01T09:00:00.00+05:30|wb',
'_rev': '1-00d108c5eb80bf944df10dfda313da20',
'state': 'wb',
'report_time': '2020-04-01T09:00:00.00+05:30',
'confirmed': 26,
'cured': 0,
'death': 2,
'source': 'mohfw',
'type': 'cases'}},
{'id': '2020-04-01T19:30:00.00+05:30|an',
'key': '2020-04-01T19:30:00.00+05:30|an',
'value': {'_id': '2020-04-01T19:30:00.00+05:30|an',
'_rev': '1-2bd4e7faf192a281c11adf36efd2bedb',
'state': 'an',
'report_time': '2020-04-01T19:30:00.00+05:30',
'confirmed': 10,
'cured': 1,
'death': 0,
'source': 'mohfw',
'type': 'cases'}},
{'id': '2020-04-01T19:30:00.00+05:30|ap',
'key': '2020-04-01T19:30:00.00+05:30|ap',
'value': {'_id': '2020-04-01T19:30:00.00+05:30|ap',
'_rev': '1-be02347603c55bcfef15649bef3002ea',
'state': 'ap',
'report_time': '2020-04-01T19:30:00.00+05:30',
'confirmed': 83,
'cured': 0,
'death': 0,
'source': 'mohfw',
'type': 'cases'}},
{'id': '2020-04-01T19:30:00.00+05:30|as',
'key': '2020-04-01T19:30:00.00+05:30|as',
'value': {'_id': '2020-04-01T19:30:00.00+05:30|as',
'_rev': '1-7d3325ff85d493b417c1bb2b15b847fe',
'state': 'as',
'report_time': '2020-04-01T19:30:00.00+05:30',
'confirmed': 1,
'cured': 0,
'death': 0,
'source': 'mohfw',
'type': 'cases'}},
{'id': '2020-04-01T19:30:00.00+05:30|br',
'key': '2020-04-01T19:30:00.00+05:30|br',
'value': {'_id': '2020-04-01T19:30:00.00+05:30|br',
'_rev': '1-3445975f9c3f7b3cef908446bd10aa58',
'state': 'br',
'report_time': '2020-04-01T19:30:00.00+05:30',
'confirmed': 23,
'cured': 0,
'death': 1,
'source': 'mohfw',
'type': 'cases'}},
{'id': '2020-04-01T19:30:00.00+05:30|ch',
'key': '2020-04-01T19:30:00.00+05:30|ch',
'value': {'_id': '2020-04-01T19:30:00.00+05:30|ch',
'_rev': '1-c1c7ae8286b3f8481d73a053ffd44b41',
'state': 'ch',
'report_time': '2020-04-01T19:30:00.00+05:30',
'confirmed': 16,
'cured': 0,
'death': 0,
'source': 'mohfw',
'type': 'cases'}},
{'id': '2020-04-01T19:30:00.00+05:30|ct',
'key': '2020-04-01T19:30:00.00+05:30|ct',
'value': {'_id': '2020-04-01T19:30:00.00+05:30|ct',
'_rev': '1-7205d950d9a8d58c07aedd5357891064',
'state': 'ct',
'report_time': '2020-04-01T19:30:00.00+05:30',
'confirmed': 9,
'cured': 2,
'death': 0,
'source': 'mohfw',
'type': 'cases'}},
{'id': '2020-04-01T19:30:00.00+05:30|dl',
'key': '2020-04-01T19:30:00.00+05:30|dl',
'value': {'_id': '2020-04-01T19:30:00.00+05:30|dl',
'_rev': '1-f81fbf23d30e0b80b6b98b482b2fb158',
'state': 'dl',
'report_time': '2020-04-01T19:30:00.00+05:30',
'confirmed': 152,
'cured': 6,
'death': 2,
'source': 'mohfw',
'type': 'cases'}},
{'id': '2020-04-01T19:30:00.00+05:30|ga',
'key': '2020-04-01T19:30:00.00+05:30|ga',
'value': {'_id': '2020-04-01T19:30:00.00+05:30|ga',
'_rev': '1-7b9f6e0eed15aeb4b6e6b0ad23f5ffbb',
'state': 'ga',
'report_time': '2020-04-01T19:30:00.00+05:30',
'confirmed': 5,
'cured': 0,
'death': 0,
'source': 'mohfw',
'type': 'cases'}},
{'id': '2020-04-01T19:30:00.00+05:30|gj',
'key': '2020-04-01T19:30:00.00+05:30|gj',
'value': {'_id': '2020-04-01T19:30:00.00+05:30|gj',
'_rev': '1-f6191023e7cd85e3ff74fd5aa38fc7f2',
'state': 'gj',
'report_time': '2020-04-01T19:30:00.00+05:30',
'confirmed': 82,
'cured': 5,
'death': 6,
'source': 'mohfw',
'type': 'cases'}},
{'id': '2020-04-01T19:30:00.00+05:30|hp',
'key': '2020-04-01T19:30:00.00+05:30|hp',
'value': {'_id': '2020-04-01T19:30:00.00+05:30|hp',
'_rev': '1-22340707ca40c15f2804cd6ff4698c8a',
'state': 'hp',
'report_time': '2020-04-01T19:30:00.00+05:30',
'confirmed': 3,
'cured': 0,
'death': 1,
'source': 'mohfw',
'type': 'cases'}},
{'id': '2020-04-01T19:30:00.00+05:30|hr',
'key': '2020-04-01T19:30:00.00+05:30|hr',
'value': {'_id': '2020-04-01T19:30:00.00+05:30|hr',
'_rev': '1-c6c6ff99fd8af2e87d4a3410ff782386',
'state': 'hr',
'report_time': '2020-04-01T19:30:00.00+05:30',
'confirmed': 43,
'cured': 21,
'death': 0,
'source': 'mohfw',
'type': 'cases'}},
{'id': '2020-04-01T19:30:00.00+05:30|jh',
'key': '2020-04-01T19:30:00.00+05:30|jh',
'value': {'_id': '2020-04-01T19:30:00.00+05:30|jh',
'_rev': '1-9d1525975200a47ab5704a703841c18e',
'state': 'jh',
'report_time': '2020-04-01T19:30:00.00+05:30',
'confirmed': 1,
'cured': 0,
'death': 0,
'source': 'mohfw',
'type': 'cases'}},
{'id': '2020-04-01T19:30:00.00+05:30|jk',
'key': '2020-04-01T19:30:00.00+05:30|jk',
'value': {'_id': '2020-04-01T19:30:00.00+05:30|jk',
'_rev': '1-9ac47b472af6dc62c660f841c44230a9',
'state': 'jk',
'report_time': '2020-04-01T19:30:00.00+05:30',
'confirmed': 62,
'cured': 2,
'death': 2,
'source': 'mohfw',
'type': 'cases'}},
...]}
data2.keys()
dict_keys(['total_rows', 'offset', 'rows'])
data2['rows'][5].keys()
dict_keys(['id', 'key', 'value'])
data2['rows'][5]['value'].keys()
dict_keys(['_id', '_rev', 'report_time', 'state', 'confirmed_india', 'confirmed_foreign', 'cured', 'death', 'source', 'type', 'confirmed'])
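The keys for row 5 include `confirmed_india` and `confirmed_foreign`, which do not appear in the later rows printed above, so the row schema seems to vary between reports. Below is a minimal sketch for counting how often each key occurs. `sample_rows` is a hypothetical, hand-made stand-in for `data2['rows']`, and `count_value_keys` is an illustrative helper, not part of the original notebook.

```python
from collections import Counter

# Hypothetical sample mimicking the two row shapes seen in data2['rows'].
sample_rows = [
    {"id": "a", "key": "a",
     "value": {"state": "kl", "confirmed_india": 3, "confirmed_foreign": 0,
               "confirmed": 3, "cured": 0, "death": 0}},
    {"id": "b", "key": "b",
     "value": {"state": "kl", "confirmed": 241, "cured": 23, "death": 2}},
]


def count_value_keys(rows):
    """Count how many rows contain each key inside row['value']."""
    counts = Counter()
    for row in rows:
        counts.update(row["value"].keys())
    return counts


key_counts = count_value_keys(sample_rows)
# Keys present in every row can be read without a fallback value.
always_present = sorted(k for k, n in key_counts.items() if n == len(sample_rows))
print(always_present)
```

Running the same helper on `data2['rows']` would show which fields are safe to read directly in every row.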
def add_tuples(a, b):
    return tuple(map(lambda i, j: i + j, a, b))
def get_state_from_abbrev(abbrev):
    states = {
        "ap": "Andhra Pradesh",
        "ar": "Arunachal Pradesh",
        "as": "Assam",
        "br": "Bihar",
        "ct": "Chhattisgarh",
        "ga": "Goa",
        "gj": "Gujarat",
        "hr": "Haryana",
        "hp": "Himachal Pradesh",
        "jh": "Jharkhand",
        "ka": "Karnataka",
        "kl": "Kerala",
        "mp": "Madhya Pradesh",
        "mh": "Maharashtra",
        "mn": "Manipur",
        "ml": "Meghalaya",
        "mz": "Mizoram",
        "nl": "Nagaland",
        "or": "Odisha",
        "pb": "Punjab",
        "rj": "Rajasthan",
        "sk": "Sikkim",
        "tn": "Tamil Nadu",
        "tg": "Telangana",
        "tr": "Tripura",
        "ut": "Uttarakhand",
        "up": "Uttar Pradesh",
        "wb": "West Bengal",
        "an": "Andaman and Nicobar Islands",
        "ch": "Chandigarh",
        "dn": "Dadra and Nagar Haveli",
        "dd": "Daman and Diu",
        "dl": "Delhi",
        "jk": "Jammu and Kashmir",
        "la": "Ladakh",
        "ld": "Lakshadweep",
        "py": "Pondicherry",
        "dn_dd": "Dadra and Nagar Haveli and Daman and Diu",
        "unassigned": "unassigned"
    }
    return states[abbrev]
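# Builds a dictionary mapping each state name to a tuple of
# (confirmed, cured, death). Note that the counts from every report
# are added together, so the result is a sum over all reports for
# that state rather than the figures from the latest report.
def extract_data2(data):
    extracted_data = {}
    rows = data["rows"]
    for row in rows:
        time, state, confirmed, cured, death = extract_row2(row)
        if state not in extracted_data:
            extracted_data[state] = (confirmed, cured, death)
        else:
            res = add_tuples(extracted_data[state], (confirmed, cured, death))
            extracted_data[state] = res
    return extracted_data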
# Extracts the report date, state name, and the confirmed, cured,
# and death counts from one row of the JSON dictionary. Used as a
# helper function in extract_data2()
def extract_row2(row):
    values_dict = row['value']
    time = values_dict['report_time']
    state = get_state_from_abbrev(values_dict['state'])
    confirmed = values_dict['confirmed']
    cured = values_dict['cured']
    death = values_dict['death']
    # Keep only the YYYY-MM-DD part of the timestamp
    return time[:10], state, confirmed, cured, death
extracted2 = extract_data2(data2)
extracted2
{'Kerala': (200185603, 177894118, 766581),
'Delhi': (158725093, 147896560, 2741459),
'Telangana': (75798728, 69580382, 432535),
'Rajasthan': (77767638, 69179429, 700476),
'Haryana': (66067411, 60344316, 698864),
'Jammu and Kashmir': (30756599, 27720245, 469131),
'Karnataka': (242380060, 216792332, 3144790),
'Ladakh': (2307394, 2084230, 28077),
'Maharashtra': (589598167, 512544984, 13590012),
'Punjab': (47764245, 42219368, 1406438),
'Tamil Nadu': (223058512, 208216246, 3261644),
'Uttar Pradesh': (160155913, 143267909, 2199456),
'Andhra Pradesh': (228887306, 215214582, 1848944),
'Uttarakhand': (23199910, 20397560, 370919),
'Odisha': (82390133, 77558828, 426009),
'West Bengal': (133436340, 123072064, 2330712),
'Pondicherry': (9996027, 9076464, 163604),
'Chandigarh': (5394798, 4840252, 77743),
'Chhattisgarh': (75304232, 66298558, 875894),
'Gujarat': (70290062, 62010591, 1330100),
'Himachal Pradesh': (12567361, 11065632, 198464),
'Madhya Pradesh': (66056188, 59307071, 976388),
'Bihar': (69575592, 64292363, 381133),
'Manipur': (6347465, 5796243, 72359),
'Mizoram': (1009532, 910925, 1770),
'Goa': (13652667, 12301268, 188057),
'Andaman and Nicobar Islands': (1283574, 1211641, 16203),
'Assam': (56413792, 52528287, 260216),
'Jharkhand': (31400674, 28349151, 296292),
'Arunachal Pradesh': (3962184, 3686314, 12013),
'Tripura': (8320110, 7728375, 92857),
'Nagaland': (2846345, 2593039, 17581),
'Meghalaya': (3074017, 2797072, 30784),
'Dadra and Nagar Haveli': (186, 14, 0),
'Sikkim': (1390467, 1243984, 26879),
'unassigned': (280806, 0, 0),
'Daman and Diu': (2, 0, 0),
'Dadra and Nagar Haveli and Daman and Diu': (1022337, 940814, 662),
'Lakshadweep': (107535, 73104, 164)}
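These totals look too large to be case counts: Kerala's figure of roughly 200 million confirmed cases is well above the state's population. A likely explanation is that each row is a cumulative snapshot, and `extract_data2` adds every snapshot together. In the raw rows above, West Bengal reports 26 confirmed cases at both `2020-03-31T20:30` and `2020-04-01T09:00`, which fits that reading. The sketch below assumes the counts are cumulative and keeps only the most recent report per state. `latest_per_state` and `sample_rows` are hypothetical names introduced for illustration, and the sample rows are copied (with fields trimmed) from the output above.

```python
def latest_per_state(rows):
    """Return {state_abbrev: (confirmed, cured, death)} from the newest report."""
    latest = {}
    for row in rows:
        v = row["value"]
        state, when = v["state"], v["report_time"]
        # The timestamps shown above share one format and UTC offset (+05:30),
        # so plain string comparison orders them chronologically.
        if state not in latest or when > latest[state][0]:
            latest[state] = (when, v["confirmed"], v["cured"], v["death"])
    # Drop the timestamp and keep only the three counts.
    return {state: vals[1:] for state, vals in latest.items()}


# Rows for West Bengal and Delhi taken from the printed data above.
sample_rows = [
    {"value": {"state": "wb", "report_time": "2020-03-31T20:30:00.00+05:30",
               "confirmed": 26, "cured": 0, "death": 2}},
    {"value": {"state": "wb", "report_time": "2020-04-01T09:00:00.00+05:30",
               "confirmed": 26, "cured": 0, "death": 2}},
    {"value": {"state": "dl", "report_time": "2020-04-01T09:00:00.00+05:30",
               "confirmed": 120, "cured": 6, "death": 2}},
    {"value": {"state": "dl", "report_time": "2020-04-01T19:30:00.00+05:30",
               "confirmed": 152, "cured": 6, "death": 2}},
]

latest = latest_per_state(sample_rows)
print(latest)
```

Summing the same four rows would give Delhi 272 confirmed cases, while the latest snapshot gives 152. If the cumulative assumption holds, the second figure is the one that reflects the state's actual count at that time.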
df2 = pd.DataFrame(extracted2)
df2 = df2.transpose()
df2['state'] = df2.index
df2.columns = ['cases', 'cured', 'deaths', 'state']
df2.drop('unassigned', inplace=True)
df2
| | cases | cured | deaths | state |
|---|---|---|---|---|
| Kerala | 200185603 | 177894118 | 766581 | Kerala |
| Delhi | 158725093 | 147896560 | 2741459 | Delhi |
| Telangana | 75798728 | 69580382 | 432535 | Telangana |
| Rajasthan | 77767638 | 69179429 | 700476 | Rajasthan |
| Haryana | 66067411 | 60344316 | 698864 | Haryana |
| Jammu and Kashmir | 30756599 | 27720245 | 469131 | Jammu and Kashmir |
| Karnataka | 242380060 | 216792332 | 3144790 | Karnataka |
| Ladakh | 2307394 | 2084230 | 28077 | Ladakh |
| Maharashtra | 589598167 | 512544984 | 13590012 | Maharashtra |
| Punjab | 47764245 | 42219368 | 1406438 | Punjab |
| Tamil Nadu | 223058512 | 208216246 | 3261644 | Tamil Nadu |
| Uttar Pradesh | 160155913 | 143267909 | 2199456 | Uttar Pradesh |
| Andhra Pradesh | 228887306 | 215214582 | 1848944 | Andhra Pradesh |
| Uttarakhand | 23199910 | 20397560 | 370919 | Uttarakhand |
| Odisha | 82390133 | 77558828 | 426009 | Odisha |
| West Bengal | 133436340 | 123072064 | 2330712 | West Bengal |
| Pondicherry | 9996027 | 9076464 | 163604 | Pondicherry |
| Chandigarh | 5394798 | 4840252 | 77743 | Chandigarh |
| Chhattisgarh | 75304232 | 66298558 | 875894 | Chhattisgarh |
| Gujarat | 70290062 | 62010591 | 1330100 | Gujarat |
| Himachal Pradesh | 12567361 | 11065632 | 198464 | Himachal Pradesh |
| Madhya Pradesh | 66056188 | 59307071 | 976388 | Madhya Pradesh |
| Bihar | 69575592 | 64292363 | 381133 | Bihar |
| Manipur | 6347465 | 5796243 | 72359 | Manipur |
| Mizoram | 1009532 | 910925 | 1770 | Mizoram |
| Goa | 13652667 | 12301268 | 188057 | Goa |
| Andaman and Nicobar Islands | 1283574 | 1211641 | 16203 | Andaman and Nicobar Islands |
| Assam | 56413792 | 52528287 | 260216 | Assam |
| Jharkhand | 31400674 | 28349151 | 296292 | Jharkhand |
| Arunachal Pradesh | 3962184 | 3686314 | 12013 | Arunachal Pradesh |
| Tripura | 8320110 | 7728375 | 92857 | Tripura |
| Nagaland | 2846345 | 2593039 | 17581 | Nagaland |
| Meghalaya | 3074017 | 2797072 | 30784 | Meghalaya |
| Dadra and Nagar Haveli | 186 | 14 | 0 | Dadra and Nagar Haveli |
| Sikkim | 1390467 | 1243984 | 26879 | Sikkim |
| Daman and Diu | 2 | 0 | 0 | Daman and Diu |
| Dadra and Nagar Haveli and Daman and Diu | 1022337 | 940814 | 662 | Dadra and Nagar Haveli and Daman and Diu |
| Lakshadweep | 107535 | 73104 | 164 | Lakshadweep |
We can see that a sizeable number of confirmed cases (280,806 in the totals above) are not attributed to any state and are labelled 'unassigned'. For now we drop this row from the table, although further analysis would be needed to understand how these cases affect the state-level picture.
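# Make a choropleth map!
import folium
# Named base_map rather than map so the built-in map(), which
# add_tuples relies on, is not shadowed
base_map = folium.Map(location=[22.71, 79.04], zoom_start=5)
base_map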
# Draws a choropleth of India's states, shading each state by the
# given df2 column using the given color scheme
def color_map(column, scheme):
    map_osm = folium.Map(location=[22.71, 79.04], zoom_start=5)
    folium.Choropleth(
        geo_data="https://gist.githubusercontent.com/jbrobst/56c13bbbf9d97d187fea01ca62ea5112/raw/e388c4cae20aa53cb5090210a42ebb9b765c0a36/india_states.geojson",
        name="choropleth",
        data=df2,
        columns=["state", column],
        key_on='properties.ST_NM',
        fill_color=scheme,
        fill_opacity=0.7,
        line_opacity=0.2,
        legend_name=f"COVID-19 {column}",
    ).add_to(map_osm)
    folium.LayerControl().add_to(map_osm)
    return map_osm
color_map("deaths", "Reds")
color_map("cured", "Greens")
color_map("cases", "Purples")
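`folium.Choropleth` joins `df2` to the GeoJSON through `key_on='properties.ST_NM'`, so a state whose name is spelled differently in the two sources would not pick up its value on the map. The sketch below is one way to check for such mismatches. `unmatched_names` is a hypothetical helper, and `sample_features` is a made-up example, not the contents of the real GeoJSON file.

```python
def unmatched_names(data_names, geojson_features, prop="ST_NM"):
    """Return (names only in the data, names only in the GeoJSON), both sorted."""
    geo_names = {feature["properties"][prop] for feature in geojson_features}
    data_names = set(data_names)
    return sorted(data_names - geo_names), sorted(geo_names - data_names)


# Made-up features for illustration; the real file may use other spellings.
sample_features = [
    {"properties": {"ST_NM": "Kerala"}},
    {"properties": {"ST_NM": "Puducherry"}},
]
sample_states = ["Kerala", "Pondicherry"]

only_in_data, only_in_geojson = unmatched_names(sample_states, sample_features)
print(only_in_data, only_in_geojson)
```

With the real data, the second argument would be the `"features"` list of the loaded GeoJSON and the first would be `df2['state']`. Any names reported could then be reconciled in `get_state_from_abbrev`.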
From these plots, we can see that things seem pretty dire in Maharashtra, which has the greatest number of deaths, cases, and cured people. This makes sense because Mumbai, India's largest city, is located in Maharashtra, so the state contains some very dense urban areas. In a dense city it is harder to socially distance, which makes it easier for COVID-19 to spread.
Looking at the other states, it appears that states in Southern India have higher numbers of cases, deaths, and cured people than states in Northern India. Kerala, in Southern India, is an interesting case study: while its numbers of cases and cured people are similar to those of its neighboring states, Tamil Nadu and Karnataka, it has significantly fewer deaths.
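The comparison between Kerala and its neighbors can be made concrete by dividing deaths by cases for each state. The sketch below uses the values from the table above. Because those values are sums over every report, the result is better read as a rough relative indicator than as a true case fatality rate. `deaths_per_100_cases` is a hypothetical helper introduced for illustration.

```python
# (cases, deaths) copied from the df2 table above.
totals = {
    "Kerala": (200185603, 766581),
    "Tamil Nadu": (223058512, 3261644),
    "Karnataka": (242380060, 3144790),
}


def deaths_per_100_cases(cases, deaths):
    """Deaths per 100 confirmed cases; NaN when there are no cases."""
    return 100 * deaths / cases if cases else float("nan")


ratios = {
    state: round(deaths_per_100_cases(cases, deaths), 2)
    for state, (cases, deaths) in totals.items()
}
print(ratios)
```

On these figures Kerala comes out at roughly 0.4 deaths per 100 cases, against roughly 1.3 for Karnataka and 1.5 for Tamil Nadu.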